Random forest algorithm in big data environment

Authors

  • Liu Yingchun
  • Yingchun Liu
Abstract

The random forest method is currently one of the most widely applied classification algorithms. Starting from real big data scenarios and requirements, this paper studies in depth the application of the random forest method in a big data environment. Because big data involves processing a huge number of features simultaneously, and because data patterns change constantly over time, the accuracy of a random forest algorithm that lacks self-renewal and adaptive mechanisms gradually declines. To address this problem, the paper analyzes the characteristics of the random forest method, presents how to give it self-adaptation ability in such situations, verifies the feasibility of the new method on real data, and discusses how to further research and improve the random forest method in a big data environment.

Similar resources

Implementation of Random Forest Algorithm in Order to Use Big Data to Improve Real-Time Traffic Monitoring and Safety

Nowadays, active traffic management delivers better performance thanks to the real-time, large-scale data available in transportation systems. With the advancement of large data, actively and appropriately monitoring and improving traffic safety has become a necessity. Performance efficiency and traffic safety are considered important elements in measuring the pe...

Energy Efficient Data Mining Scheme for Big Data Biodiversity Environment

In this paper, we propose a novel energy efficient data mining scheme for big data biodiversity environment. Efficient machine learning and data mining techniques provide an unprecedented opportunity to monitor and characterize big data biodiversity environments, such as forest cover type, monitored using low cost wireless sensor networks. However, given the sheer amount of data collected by th...

Diagnosis of Diabetes Using a Random Forest Algorithm

Background: Diabetes is the fourth leading cause of death in the world. And because so many people around the world have the disease, or are at risk for it, diabetes can be called the disease of the century. Diabetes has devastating effects on the health of people in the community and if diagnosed late, it can cause irreparable damage to vision, kidneys, heart, arteries and so on. Therefore, it...

Integrative random forest for gene regulatory network inference

Motivation: Gene regulatory network (GRN) inference based on genomic data is one of the most actively pursued computational biological problems. Because different types of biological data usually provide complementary information regarding the underlying GRN, a model that integrates big data of diverse types is expected to increase both the power and accuracy of GRN inference. Towards this goal,...

Improvement of Support Vector Machine and Random Forest Algorithm in Predicting Khorramabad River Flow Using Non-uniform De-Noising of Data and Simplex Algorithm

In this study, to simulate the monthly flow of the Khorramabad River, the river's time series for the period 1955-2014 was decomposed into three levels using the Daubechies-3 wavelet. On this basis, the signal was found to contain non-uniform noise spanning two time periods, with a boundary at October 2008, which required that the signal be non-un...

Publication date: 2015